Mixed precision s-step Lanczos and conjugate gradient algorithms

Abstract

Compared to the classical Lanczos algorithm, the s-step Lanczos variant has the potential to improve performance by asymptotically decreasing the synchronization cost per iteration. However, this comes at a price; despite being mathematically equivalent, the s-step variant may behave quite differently in finite precision, potentially exhibiting greater loss of accuracy and slower convergence relative to the classical algorithm. It has previously been shown that the errors in the s-step version follow the same structure as the errors in the classical algorithm, but are amplified by a factor depending on the square of the condition number of the O(s)-dimensional Krylov bases computed in each outer loop. As the condition number of these s-step bases grows (in some cases very quickly) with s, this limits the values of s that can be chosen and thus can limit the attainable performance. In this work, we show that if a select few computations in s-step Lanczos are performed in double the working precision, the error terms then depend only linearly on the conditioning of the s-step bases. This has the potential for drastically improving the numerical behavior of the algorithm with little impact on per-iteration performance. Our numerical experiments demonstrate the improved numerical behavior possible with the mixed precision approach, and also show that this improved behavior extends to mixed precision s-step CG. We present preliminary performance results on NVIDIA V100 GPUs that show that the overhead of extra precision is minimal if one uses precisions implemented in hardware.
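The following is a minimal sketch of the general idea, not the authors' implementation. It assumes float32 as the working precision, float64 as "double the working precision", a diagonal test matrix, and a normalized monomial Krylov basis. It also assumes that the Gram matrix of the basis is one of the computations worth promoting; the abstract does not say which computations are selected. The function names are hypothetical. The sketch prints the condition number of the basis for several values of s and compares the error of the Gram matrix computed in each precision against a correctly rounded reference.

```python
import math
import numpy as np

def monomial_basis(diag_A, v, s):
    """Columns v, Av, ..., A^s v, each normalized, stored in float32."""
    V = np.empty((diag_A.size, s + 1), dtype=np.float32)
    w = v.astype(np.float32)
    V[:, 0] = w / np.linalg.norm(w)
    for j in range(1, s + 1):
        w = diag_A * V[:, j - 1]  # A is diagonal, so A @ x is elementwise
        V[:, j] = w / np.linalg.norm(w)
    return V

def gram_working(V):
    """Gram matrix V^T V computed entirely in float32 (working precision)."""
    return V.T @ V

def gram_doubled(V):
    """Gram matrix computed in float64 (assumed 'double the working precision')."""
    V64 = V.astype(np.float64)
    return V64.T @ V64

def gram_reference(V):
    """Reference Gram matrix: products of float32 values are exact in float64,
    and math.fsum returns the correctly rounded sum of those products."""
    V64 = V.astype(np.float64)
    k = V64.shape[1]
    G = np.empty((k, k))
    for i in range(k):
        for j in range(k):
            G[i, j] = math.fsum(V64[:, i] * V64[:, j])
    return G

n = 500
diag_A = np.linspace(1.0, 100.0, n, dtype=np.float32)  # assumed test spectrum
v = np.ones(n)

results = {}
for s in (2, 4, 8):
    V = monomial_basis(diag_A, v, s)
    G_ref = gram_reference(V)
    results[s] = (
        np.linalg.cond(V.astype(np.float64)),      # conditioning of the basis
        np.abs(gram_working(V) - G_ref).max(),     # error in working precision
        np.abs(gram_doubled(V) - G_ref).max(),     # error in doubled precision
    )
    print(f"s={s}: cond={results[s][0]:.2e}, "
          f"float32 error={results[s][1]:.2e}, float64 error={results[s][2]:.2e}")
```

The sketch only illustrates two ingredients named in the abstract: the basis becomes worse conditioned as s grows, and promoting a small computation to higher precision reduces its rounding error. It does not reproduce the paper's error bounds or its performance measurements.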

Similar resources

Implementation of Conjugate Gradient Algorithms

We evaluate the High-Performance Fortran (HPF) language for the compact expression and efficient implementation of conjugate gradient iterative matrix-solvers on High Performance Computing and Communications (HPCC) platforms. We discuss the use of intrinsic functions, data distribution directives and explicitly parallel constructs to optimize performance by minimizing communications requirements ...

An K-Step Preconditioned Conjugate Gradient Method

This paper describes a preconditioned conjugate gradient method that can be effectively implemented on both vector machines and parallel arrays to solve sparse symmetric and positive definite systems of linear equations. The implementation on the CYBER 203/205 and on the Finite Element Machine is discussed and results obtained using the method on these machines are given.

The Adaptive s-step Conjugate Gradient Method

On modern large-scale parallel computers, the performance of Krylov subspace iterative methods is limited by global synchronization. This has inspired the development of s-step (or communication-avoiding) Krylov subspace method variants, in which iterations are computed in blocks of s. This reformulation can reduce the number of global synchronizations per iteration by a factor of O(s), and has...

GPU-Based Parallel Nonlinear Conjugate Gradient Algorithms

In this paper we describe some parallel algorithms for solving nonlinear systems using CUDA (Compute Unified Device Architecture) over a GPU (Graphics Processing Unit). The proposed algorithms are based on both the Fletcher-Reeves version of the nonlinear conjugate gradient method and a polynomial preconditioner type based on block two-stage methods. Several strategies of parallelization and di...

Set-Membership Constrained Conjugate Gradient Beamforming Algorithms

In this work a constrained adaptive filtering strategy based on conjugate gradient (CG) and set-membership (SM) techniques is presented for adaptive beamforming. A constraint on the magnitude of the array output is imposed to derive an adaptive algorithm that performs data-selective updates when calculating the beamformer’s parameters. We consider a linearly constrained minimum variance (LCMV) ...

Journal

Journal title: Numerical Linear Algebra With Applications

Year: 2021

ISSN: 1070-5325, 1099-1506

DOI: https://doi.org/10.1002/nla.2425